Clusters composed of fast personal computers are now well established as cheap and efficient platforms for distributed and parallel applications. The main drawback of standard NOWs is the poor performance of the standard inter-process communication mechanisms based on RPC, sockets, TCP/IP, and Ethernet. Such standard communication mechanisms perform poorly in terms of both throughput and message...
In this document we briefly review memory management and DMA considerations for common SCI hardware and the Virtual Interface Architecture. On this basis we present our ideas for improved memory management in hardware that combines the positive characteristics of both technologies, yielding a completely new design rather than simply adding one to the other. The described...
While standard processors achieve supercomputer performance, a performance gap exists between the interconnect of MPPs and COTS. Standard solutions like Ethernet cannot keep up with the demand for high-speed communication of today's powerful CPUs. Hence, high-speed interconnects have an important impact on a cluster's performance. While standard solutions for processing nodes exist, communication...
Parallel processing is based on utilizing a group of processors to efficiently solve large problems faster than is possible on a single processor. To accomplish this, the processors must communicate and coordinate with each other through some type of network. However, the only function that most networks support is message routing. Consequently, functions that involve data from a group of processors...
The recent interest and growing popularity of commodity-based cluster computing has created a demand for low-latency, high-bandwidth interconnect technologies. Early cluster systems have used expensive but fast interconnects such as Myrinet or SCI. Even though these technologies provide low-latency, high-bandwidth communications, the cost of an interface card almost matches that of individual computers...
Many common implementations of Message Passing Interface (MPI) implement collective operations over point-to-point operations. This work examines IP multicast as a framework for collective operations. IP multicast is not reliable. If a receiver is not ready when a message is sent via IP multicast, the message is lost. Two techniques for ensuring that a message is not lost due to a slow receiving...
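As a rough illustration of what "collective operations over point-to-point operations" can look like, the sketch below builds a broadcast from individual sends arranged as a binomial tree. It is a generic textbook pattern, not the method of the paper above and not the code of any particular MPI library; the function name `binomial_broadcast_schedule` and the choice of rank 0 as root are assumptions made for the example.

```python
from typing import List, Tuple

def binomial_broadcast_schedule(n: int) -> List[List[Tuple[int, int]]]:
    """Return, per round, the (sender, receiver) pairs of a broadcast from rank 0.

    Hypothetical helper: each pair stands for one point-to-point send.
    After every round the number of ranks holding the message doubles.
    """
    rounds = []
    have = 1  # ranks 0 .. have-1 already hold the message
    while have < n:
        # Every rank that holds the message forwards it to rank + have.
        sends = [(src, src + have) for src in range(have) if src + have < n]
        rounds.append(sends)
        have *= 2
    return rounds

if __name__ == "__main__":
    for i, sends in enumerate(binomial_broadcast_schedule(6)):
        print("round", i, sends)
```

With this layout a broadcast to n ranks finishes in about log2(n) rounds of point-to-point sends, whereas a root that sends to each rank in turn needs n - 1 sequential sends. A single multicast transmission, as examined in the abstract, would replace all of these sends if delivery could be made reliable.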
One of the challenges in large scale distributed computing is to utilize the thousands of idle personal computers. In this paper, we present a system that enables users to effortlessly and safely export their machines in a global market of processing capacity. Efficient resource allocation is performed based on statistical machine profiles and leases are used to promote dynamic task placement. The...
One of the new research trends within the well-established cluster computing area is the growing interest in the use of multiple workstation clusters as a single virtual parallel machine, in much the same way as individual workstations are nowadays connected to build a single parallel cluster. In this paper we present an analysis of several aspects concerning the integration of different workstation...
This paper presents an efficient parallel information retrieval (IR) system which provides fast information service for Internet users in a low-cost, high-performance PC-NOW environment. The IR system is implemented on a PC cluster based on the Scalable Coherent Interface (SCI), a powerful interconnecting mechanism for both shared memory models and message passing models. In the IR system, the inverted-index...
In this paper we study the use of networks of PCs to handle the parallel execution of relational database queries. This approach is based on a parallel extension, called parallel relational query evaluator, working in a coupled mode with a sequential DBMS. We present a detailed architecture of the parallel query evaluator and introduce Enkidu, the efficient Java-based prototype that has been built...
In recent years, new parallel and distributed computational models have been proposed in the literature, reflecting advances in new computational devices and environments such as optical interconnects, FPGA devices, networks of workstations, radio communications, DNA computing, quantum computing, etc. New algorithmic techniques and paradigms have recently been developed for these new models.
Trends in parallel computing indicate that heterogeneous parallel computing will be one of the most widespread platforms for computation-intensive applications. A heterogeneous computing environment offers considerably more computational power at a lower cost than a parallel computer. We propose the Heterogeneous Bulk Synchronous Parallel (HBSP) model, which is based on the BSP model of parallel computation,...
We investigate the issue of stalling in the LogP model. In particular, we introduce a novel quantitative characterization of stalling, referred to as δ-stalling, which intuitively captures the realistic assumption that once the network’s capacity constraint is violated, it takes some time (at most δ) for this information to propagate to the processors involved. We prove a lower bound that shows that...
In this paper, we consider parallelizability of some P-complete problems. First we propose a parameter which indicates parallelizability for a convex layers problem. We prove P-completeness of the problem and propose a cost optimal parallel algorithm, according to the parameter. Second we consider a lexicographically first maximal 3 sums problem. We prove P-completeness of the problem by reducing...
The main contribution of this paper is in designing an optimal and/or optimal speed-up algorithm for computing shape moments. We introduce a new technique for computing shape moments. The new technique is based on the quadtree representation of images. We decompose the image into squares, since the moment computation of squares is easier than that of the whole image. The proposed sequential algorithm...
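The idea of decomposing an image into squares can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes a square binary image whose side is a power of two, uses the raw moment definition m_pq = Σ x^p y^q over foreground pixels (x = column, y = row), and all function names are hypothetical. The moment of a uniform square factors into a product of two one-dimensional power sums, which is what makes squares easier to handle than the whole image.

```python
from typing import List

Image = List[List[int]]  # assumed: square binary image, side a power of two

def power_sum(lo: int, hi: int, p: int) -> int:
    """Sum of x**p for x in [lo, hi). A closed form could replace this loop."""
    return sum(x ** p for x in range(lo, hi))

def square_moment(x0: int, y0: int, size: int, p: int, q: int) -> int:
    """Raw moment of a fully set square: the double sum separates in x and y."""
    return power_sum(x0, x0 + size, p) * power_sum(y0, y0 + size, q)

def quadtree_moment(img: Image, p: int, q: int,
                    x0: int = 0, y0: int = 0, size: int = 0) -> int:
    """Raw moment m_pq via recursive quadtree decomposition of the image."""
    if size == 0:
        size = len(img)
    cells = [img[y][x] for y in range(y0, y0 + size)
             for x in range(x0, x0 + size)]
    if not any(cells):   # empty block contributes nothing
        return 0
    if all(cells):       # uniform block: use the separable formula
        return square_moment(x0, y0, size, p, q)
    half = size // 2     # mixed block: split into four quadrants
    return sum(quadtree_moment(img, p, q, x0 + dx, y0 + dy, half)
               for dy in (0, half) for dx in (0, half))

def brute_force_moment(img: Image, p: int, q: int) -> int:
    """Reference implementation: sum over every foreground pixel."""
    return sum(x ** p * y ** q
               for y, row in enumerate(img)
               for x, v in enumerate(row) if v)

if __name__ == "__main__":
    img = [[1, 1, 0, 0],
           [1, 1, 0, 0],
           [0, 0, 1, 0],
           [0, 0, 0, 0]]
    print(quadtree_moment(img, 0, 0), quadtree_moment(img, 1, 0))
```

In the example image the top-left 2×2 block is handled by a single call to `square_moment`, so only the mixed quadrant is subdivided further. The sketch rescans each block to test uniformity; a real quadtree would store that information in its nodes.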
Consider a network of nodes; each node represents a philosopher; links represent the neighboring relationship among the philosophers. Every philosopher enjoys singing so much that once getting the chance, he always sings a song within a finite delay. This paper proposes a protocol for the philosophers to follow. The protocol guarantees the following requirements: (1) No two neighboring philosophers...
Recently, many efficient parallel algorithms on the reconfigurable mesh have been developed. However, it is not easy to understand the behavior of a reconfigurable mesh. This is mainly because the bus topology can change dynamically during the execution of an algorithm. In this work, we have developed JRM, a Java applet for visualizing parallel algorithms on the reconfigurable mesh to help on understanding...
A PRAM (Parallel Random Access Machine) [4] is the parallel computational model most notable for supporting the parallel algorithmic theory. It consists of a number of processors sharing a common memory. The processors communicate by exchanging data through a shared memory cell. Each processor can access any memory cell in one unit of time and all processors operate synchronously under the control...
In this paper we present a novel parallel arithmetic architecture using an efficient non-binary logic scheme. We show that by using parallel broadcasting (or domino propagating) state signals on short reconfigurable buses equipped with a type of switch, called GP (generate-propagate) shift switches, several arithmetic operations can be carried out efficiently. We extend a recently proposed shift...